DOCUMENT RESUME

ED 148 302                                        IR 005 252

AUTHOR        Gershman, Anatole V.
TITLE         Analyzing English Noun Groups for Their Conceptual
              Content.
SPONS AGENCY  Advanced Research Projects Agency (DOD), Washington,
              D.C.
PUB DATE      77
CONTRACT      N00014-75-C-1111
NOTE          39p.

EDRS PRICE    MF-$0.83 HC-$2.06 Plus Postage.
DESCRIPTORS   *Artificial Intelligence; *Computational Linguistics;
              Computer Programs; *Machine Translation; *Nominals;
              Phrase Structure; *Programming Languages

ABSTRACT
     An expectation based system, NGP, for parsing English
noun groups into the Conceptual Dependency representation is
described. The system is a part of English Language Interpreter (ELI)
which is used as the front end to several natural language
understanding systems and is capable of handling a wide range of
sentences of considerable complexity. NGP processes the input from
left to right, one word at a time, using linguistic and world
knowledge to find the meaning of a noun group. Dictionary entries for
individual words contain much of the program's knowledge. In
addition, a limited ability for the handling of slightly incorrect
sentences and unknown words is incorporated. (Author)



Documents acquired by ERIC include many informal unpublished
materials not available from other sources. ERIC makes every effort
to obtain the best copy available. Nevertheless, items of marginal
reproducibility are often encountered and this affects the quality
of the microfiche and hardcopy reproductions ERIC makes available
via the ERIC Document Reproduction Service (EDRS). EDRS is not
responsible for the quality of the original document. Reproductions
supplied by EDRS are the best that can be made from the original.




U.S. DEPARTMENT OF HEALTH,
EDUCATION & WELFARE
NATIONAL INSTITUTE OF
EDUCATION

THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM
THE PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR
OPINIONS STATED DO NOT NECESSARILY REPRESENT OFFICIAL NATIONAL
INSTITUTE OF EDUCATION POSITION OR POLICY.

Analyzing English Noun Groups
for their Conceptual Content

Anatole V. Gershman
Department of Computer Science
Yale University
New Haven, Connecticut 06520



Abstract 








     An expectation-based system (NGP) for parsing English noun
groups into the Conceptual Dependency representation is
described. The system is a part of ELI (English Language
Interpreter) which is used as the front end to several natural
language understanding systems and is capable of handling a wide
range of sentences of considerable complexity. NGP processes the
input from left to right, one word at a time, using linguistic
and world knowledge to find the meaning of a noun group.
Dictionary entries for individual words contain much of the
program's knowledge. In addition, a limited ability for the
handling of slightly incorrect sentences and unknown words is
incorporated.






0. Introduction 






     Every natural language processor has to have the ability to
interpret noun phrases. This paper describes a set of programs
called NGP (Noun Group Processor) which is an integral part of
ELI, the English Language Interpreter (Riesbeck and Schank 1976),
which serves as the front end to three of the Yale natural
language understanding systems: SAM, PAM and WEIS. SAM is a
system capable of understanding stories such as various newspaper
reports by using scripts (Schank and Abelson 1975, 1977;
Cullingford 1975, 1977). PAM is an understanding system which



This work was supported in part by the Advanced Research Projects
Agency of the Department of Defense and monitored under the
Office of Naval Research under contract N00014-75-C-1111.




uses general knowledge about people's goals and plans (Wilensky
1976). WEIS is a system which understands and classifies a great
variety of isolated newspaper headlines on international
relations. Thus, our task was not only to process noun phrases
of considerable complexity, but also to interpret newspaper
headlines, which are not always grammatically correct. The
following two examples illustrate the kind of sentences our
system is able to handle.

1. A CONNECTICUT MAN, JOHN DOE, AGE 23, OF 342 COLLEGE AVENUE,
   NEW HAVEN WAS PRONOUNCED DEAD AT THE SCENE BY DR. DANA
   BLAUCHARD, MEDICAL EXAMINER.

2. FUNERAL OF INDIA'S SHASTRI ATTENDED BY USSR KOSYGIN AND USA
   HUMPHREY.

     To process such a large scope of sentences the program makes
extensive use of its knowledge of the problem domain and the
redundancy of natural language expressions. This saves effort
and permits correct processing of such irregularities of input
texts as missing commas and articles, or slightly incorrect word
order. It also provides for the ability to ignore unknown words
or (in some cases) to make plausible interpretations of unknown
words. This knowledge is kept in the dictionary. The control
mechanisms remain domain independent.

     NGP is a production-like system which uses expectations as
its basic control mechanism. The problem with every
production-like system is the tendency for the accumulation of a
large number of expectations fighting for a chance to be
tested. In this work I have tried to develop a theory of how
various expectations are organized and processed, which, I
believe, is in fact a theory of how people process natural
language. The basic guiding principle for this theory was its
intuitive plausibility.



1. Noun Group Semantics

     We differentiate four classes of noun groups according to
the conceptual structures they generate:

     1. PP - Picture Producers

     2. CP - Concept Producers

     3. TD - Time Descriptors

     4. SD - State Descriptors

1.1 Picture Producers

     PP's are defined by Schank (Schank 1975) as concepts which
tend to produce pictures of real world items in the mind of a
hearer. For example,

     (1) A BIG RED APPLE

is a Picture Producing noun group. To understand such an item
means to identify the structure in the memory which corresponds
to this item if such a structure exists or to create one
according to some frame. This is done in two stages. In the
first stage, we analyze the input phrase and translate it into an
expression in Conceptual Dependency (Schank 1972, 1973, 1975).
This expression should preserve in a language independent form
all the information contained in the surface phrase. Thus (1)
will generate

     (#PHYSOBJ TYPE (APPLE) COLOR (x) SIZE (y) DETERM (INDEF)),

where x and y are points on the color and size scales. In the
second stage, we identify the CD expression with the existing
memory structures by performing the necessary memory search and
feature matching.

     A CD expression for a PP consists of a header followed by a
property list. The header is similar to a superset pointer in
hierarchically organized memory systems. It points to a frame of
properties that the PP is expected to have. The property list
explicitly given in the CD expression must be compatible with
this frame. Thus a (#PERSON) is expected to have FIRSTNAME,
LASTNAME, RESIDENCE, etc., but a (#PHYSOBJ) is not. All
properties not included in the frame must be specified by a REL
clause. For example,

     (2) JOHN DOE, THE PASSENGER OF THE CAR

is represented by

     (#PERSON FIRSTNAME (JOHN) LASTNAME (DOE)
              REL ((<=> ($DRIVE PASSENGER MODFOCUS)))),

where MODFOCUS is a back pointer to the focus of the REL
modifier, i.e. to (#PERSON ...).



     SAM's memory program accepts 7 general classes of PP's:
#PERSON, #PHYSOBJ, #ORGANIZATION, #LOCALE, #ROAD, #GROUP, and
#POLITY, which can be illustrated by the following examples:

(3) JOHN           = (#PERSON FIRSTNAME (JOHN))

(4) TABLE          = (#PHYSOBJ TYPE (*TABLE*))

(5) NAVY           = (#ORGANIZATION BRANCH (NAVY))

(6) 593 FOXON RD   = (#LOCALE STREETNUMBER (593)
                              STREETNAME (FOXON)
                              STREETTYPE (ROAD))

(7) ROUTE 69       = (#ROAD ROADNUMBER (69)
                            ROADTYPE (HIGHWAY))

(8) JOHN AND MARY  = (#GROUP
                        MEMBER (#PERSON FIRSTNAME (JOHN))
                        MEMBER (#PERSON FIRSTNAME (MARY)))

(9) USA            = (#POLITY TYPE (COUNTRY) NAME (USA))
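
The relation between a header, its frame, and the property list can be sketched in a few lines of Python. This is only an illustrative sketch and not SAM's memory program: the frame contents are assumptions pieced together from the properties that appear in the examples above, and the names FRAMES, GENERAL, and is_compatible are hypothetical.

```python
# Hypothetical frames: each header points to the properties a PP of that
# class is expected to have. The sets are assumptions collected from the
# examples in the text, not the actual frames used by SAM.
FRAMES = {
    "#PERSON": {"FIRSTNAME", "LASTNAME", "RESIDENCE"},
    "#PHYSOBJ": {"TYPE", "COLOR", "SIZE"},
    "#ORGANIZATION": {"BRANCH"},
    "#LOCALE": {"STREETNUMBER", "STREETNAME", "STREETTYPE"},
    "#ROAD": {"ROADNUMBER", "ROADTYPE"},
    "#GROUP": {"MEMBER"},
    "#POLITY": {"TYPE", "NAME"},
}

# Properties allowed with any header in this sketch (an assumption):
# REL carries whatever the frame does not cover, DETERM and REF mark
# definiteness.
GENERAL = {"REL", "DETERM", "REF"}


def is_compatible(header, properties):
    """Return True if every property name fits the frame of the header.

    `properties` is a list of (name, value) pairs, so a name such as
    MEMBER may occur more than once.
    """
    frame = FRAMES[header]
    return all(name in frame or name in GENERAL for name, _ in properties)


john = ("#PERSON", [("FIRSTNAME", "JOHN"), ("LASTNAME", "DOE"),
                    ("REL", "(<=> ($DRIVE PASSENGER MODFOCUS))")])
odd = ("#PHYSOBJ", [("TYPE", "*TABLE*"), ("FIRSTNAME", "JOHN")])

print(is_compatible(*john))  # a person with a name and a REL clause
print(is_compatible(*odd))   # a physical object with a first name
```

Keeping the property list as pairs rather than a dictionary lets a #GROUP carry several MEMBER entries, as in example (8).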



1.2 Concept Producers

     Very often noun groups do not describe any real world items.
Consider the following sentence:

     (12) JOHN VOTED IN THE 1976 PRESIDENTIAL ELECTION.

THE 1976 PRESIDENTIAL ELECTION does not produce a single
"picture" in the mind of the hearer. Rather, it points to a
complicated concept involving the names of the candidates,
primaries, voter registration, etc. The knowledge about
elections is normally organized in a script-like form. The verb
VOTED specifies the role John played in the election script.
Thus, the meaning of (12) is the invocation of the election
script and the instantiation of the script roles. The CD
representation of THE 1976 PRESIDENTIAL ELECTION produced by the
parser looks as follows:

     ($ELECTION TYPE (PRESIDENTIAL) TIME (1976) REF (DEF)),

where $ELECTION is a script name and TYPE and TIME are script
parameters. This output is interpreted by the Script Applier.
All script names and parameters which appear in the CD expression
must be recognizable by the Script Applier.

1.3 Time Descriptors

     This type of noun group can be illustrated by the following
example:

     (13) LAST YEAR WAS BAD FOR JOHN.

Sentence (13) means that something unspecified happened which
made John unhappy and that this event (or events) occurred during
last year. LAST YEAR does not generate a separate concept but
enters as a time modifier into another concept. Other examples
of Time Descriptors are: YESTERDAY, MONDAY MORNING, THE WHOLE
DAY, etc.

1.4 State Descriptors

     Noun groups of this class produce assertions about the
states of PP's. For example, the meaning of

     (14) THE BEAUTY OF THE PLACE (struck John)

is "THE PLACE IS VERY HIGH ON SOME AESTHETIC SCALE", or, in
CD form:

     ((ACTOR (#LOCALE REF (DEF)) IS (*AESTHETIC-SCALE* VAL (10))))

Phrase (14) is an assertion of a fact about the place rather than
a PP with a modifier as in

     (15) (I saw) A BEAUTIFUL PLACE,

which can be represented in CD form as

     (#LOCALE
        REL ((ACTOR MODFOCUS IS (*AESTHETIC-SCALE* VAL (10))))
        REF (INDEF)),

i.e. a place which is very high on some aesthetic scale.

2. Basic Noun Group Parser

     The goal and the general methods of the Noun Group Parser
(NGP) are identical to the rest of ELI, i.e. the goal of NGP is
the extraction of the conceptualizations that underlie the input.
Expectations are its basic mechanisms of operation (see
Riesbeck and Schank 1976). However, the control structure and
the order in which the expectations are stored and tested in NGP
are very different from those of ELI. To put it briefly, in ELI
all the expectations are placed in one pool and are tested
whenever a new word or concept is considered. NGP takes
advantage of the relatively simple structure of noun
groups to select and order suitable expectations at each point of
the process. The program examines the words of the input string
from left to right. The basic loop of the analyzer consists of
two steps:

1. The dictionary definition of the current word is loaded into
   the active memory.

2. Relevant expectations are selected and tested. If an
   expectation is satisfied, the actions associated with it are
   executed.

This basic loop is similar to the monitoring control program of
ELI or any other production-like system. The difference is in
the selection and ordering of expectations. This process is
rather complicated and I will try to describe it systematically
and in increasingly greater detail throughout the rest of the
paper. I will begin by presenting the analysis of a simple
example

     (1) LARGE CHINESE RESTAURANT



First, NGP sees the word LARGE. The dictionary definition of
LARGE is a program which can test the environment when LARGE is
brought into the active memory and build the initial SEMANTIC
NODE for it. These semantic nodes (called NGP nodes in the
program) are the construction sites where various parts of the
future CD expression are being assembled. The node for LARGE,
say NGP1, has an expectation attached to it which says "if the
next semantic node is an [...] then attach modifier
SIZE (x) to it". NGP1 is saved in a stack called MODLIST.



     The word CHINESE builds the semantic node NGP2, whose
SEMANTIC VALUE is (*CHINA*) and which has an expectation saying
"if the next semantic node is a #PHYSOBJ then attach the modifier
MADEIN (*CHINA*) to it, if it is a #PERSON or an #ORGANIZATION
then attach the modifier PARTOF (*CHINA*) to it". Having done
this, the monitor checks the expectation attached to NGP1. It
fails and NGP2 is placed on the top of MODLIST.



     Next comes the word RESTAURANT. It builds the semantic node
NGP3 whose semantic value is (#ORGANIZATION OCCUPATION
(RESTAURANT)) and which has an expectation: "if the PREVIOUS
semantic node can be a restaurant type then attach it to the
current node". Now the monitor goes into the expectation testing
mode of operation. It sees two sets of expectations: those
attached to NGP2 looking "forward" at NGP3 and those attached to
NGP3 looking "backward" at NGP2. Expectations attached to NGP1
are not considered because NGP1 is hidden by NGP2. First, the
monitor tests those expectations of the current node which look
"backward" (called BACKWARD in the program). If there are no
such expectations, or if all of them fail, the monitor tests the
"forward" expectations (called FORWARD in the program) attached
to the previous semantic node. If an expectation is satisfied,
the stack is popped and the process is repeated until no
expectations are satisfied. Intuitively, MODLIST contains those
modifiers which have not yet been attached. The current node,
which is kept in NGAP, is the focus of assembling activities at
each step. In our example (*CHINA*) can be a restaurant type,
the expectation is satisfied, the value of NGP3 is modified, and
NGP2 is removed from MODLIST. The following diagram illustrates
the transition:

BEFORE:  MODLIST = NGP2, NGP1
         NGAP    = NGP3
         NGP3    = (#ORGANIZATION OCCUPATION (RESTAURANT))

AFTER:   MODLIST = NGP1
         NGAP    = NGP3
         NGP3    = (#ORGANIZATION OCCUPATION (RESTAURANT)
                                  TYPE (*CHINA*))

Now the monitor sees NGP1 on the top of the stack. Since NGP3
does not have any BACKWARD expectations left, the FORWARD
expectation of NGP1 is tested. Note that at this point, NGP3
does not correspond to any particular word, but represents the
combined meaning of CHINESE RESTAURANT. LARGE can be attached to
NGP3 and the resulting structure is:

         MODLIST = EMPTY
         NGAP    = NGP3
         NGP3    = (#ORGANIZATION OCCUPATION (RESTAURANT)
                                  TYPE (*CHINA*)
                                  SIZE (X))



     So far, we have introduced the following concepts:

SEMANTIC NODES - are the nuclei around which all construction
     activities are done. The value of a semantic node is a
     piece of conceptual structure which might be used in
     assembling the CD expression for the whole noun group.
BACKWARD and FORWARD - are the two groups of expectations
     attached to a semantic node.
NGAP - holds the current semantic node.
MODLIST - is a stack which holds all previous semantic nodes.

     The basic control algorithm of NGP, which was informally
described with the help of the above example, now can be stated
in more precise terms:

STEP1 Read new word. Execute its definition and put the
      resulting semantic node in NGAP.
STEP2 If MODLIST is empty then go to STEP7 else go to STEP3.
STEP3 If NGAP does not have any BACKWARD expectations go to
      STEP5, otherwise go to STEP4.
STEP4 Evaluate BACKWARD expectations of NGAP. In case of failure
      go to STEP5, otherwise pop the stack and go to STEP2.
STEP5 If the semantic node on the top of MODLIST does not have
      any FORWARD expectations then go to STEP7 otherwise go to
      STEP6.
STEP6 Evaluate FORWARD expectations. In case of failure go to
      STEP7, otherwise pop the stack and go to STEP2.
STEP7 Put the content of NGAP (current semantic node) on MODLIST
      and go to STEP1.
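
The seven steps can be traced on LARGE CHINESE RESTAURANT with a small Python sketch. It is not the NGP program itself, only a minimal illustration under stated assumptions: the Node class, the fire helper, and the three dictionary entries are hypothetical, the classes accepted by the LARGE expectation and the set of restaurant types are guesses, a satisfied expectation is simply dropped, and the loop stops when the input runs out because the algorithm as stated has no STOP step.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """A semantic node: a piece of conceptual structure plus expectations."""
    head: str
    props: Dict[str, str] = field(default_factory=dict)
    backward: List[Callable] = field(default_factory=list)
    forward: List[Callable] = field(default_factory=list)


def fire(expectations, owner, other):
    """Test expectations in order; drop and report the first satisfied one."""
    for expectation in list(expectations):
        if expectation(owner, other):
            expectations.remove(expectation)
            return True
    return False


def large():
    def size(me, nxt):  # FORWARD; the accepted classes are an assumption
        if nxt.head in ("#PHYSOBJ", "#ORGANIZATION"):
            nxt.props["SIZE"] = "x"
            return True
        return False
    return Node("*LARGE*", forward=[size])


def chinese():
    def origin(me, nxt):  # FORWARD, as described in the text
        if nxt.head == "#PHYSOBJ":
            nxt.props["MADEIN"] = "*CHINA*"
        elif nxt.head in ("#PERSON", "#ORGANIZATION"):
            nxt.props["PARTOF"] = "*CHINA*"
        else:
            return False
        return True
    return Node("*CHINA*", forward=[origin])


def restaurant():
    def cuisine(me, prev):  # BACKWARD; the set of types is an assumption
        if prev.head in ("*CHINA*", "*ITALY*"):
            me.props["TYPE"] = prev.head
            return True
        return False
    return Node("#ORGANIZATION", {"OCCUPATION": "RESTAURANT"},
                backward=[cuisine])


DICTIONARY = {"LARGE": large, "CHINESE": chinese, "RESTAURANT": restaurant}


def parse(words):
    modlist = []                                # stack of unattached nodes
    for word in words:
        ngap = DICTIONARY[word]()               # STEP1
        while modlist:                          # STEP2
            top = modlist[-1]
            if fire(ngap.backward, ngap, top):  # STEP3, STEP4
                modlist.pop()
            elif fire(top.forward, top, ngap):  # STEP5, STEP6
                modlist.pop()
            else:
                break
        modlist.append(ngap)                    # STEP7
    return modlist


result = parse(["LARGE", "CHINESE", "RESTAURANT"])
print(result[0].head, result[0].props)
```

Because BACKWARD expectations of the current node are tried before the FORWARD expectations of the stack top, CHINESE ends up as the TYPE of the restaurant rather than as a PARTOF modifier.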

The underlying assumptions of this algorithm are:

(a) People read noun groups from left to right.

(b) People do not passively accumulate words until they decide
    that they have reached the head noun. Instead, they make
    decisions about the interpretations and combinations of words
    as soon as it becomes possible (i.e. as soon as an
    expectation is satisfied). Thus, in a phrase MEAT SHOP
    OWNER, MEAT SHOP is interpreted before OWNER is read.

(c) Expectations attached to words which come later in the phrase
    usually are stronger than those of preceding words. In the
    sequence of words of a simple noun group (like FEARLESS
    CHINESE SOLDIER) words on the left are usually modifiers of
    some word on the right. A modifier normally has FORWARD
    expectations for a fairly large class of items it can modify.
    On the other hand, it is relatively seldom that a word is
    looking for a particular modifier on its left. In general,
    the more specific the expectation is, the higher priority it
    should have. This is what happened in our example with
    CHINESE RESTAURANT.

     So far, I have carefully avoided one very important problem.
My basic control algorithm does not have a STOP statement. Where
does a noun group end? This problem is discussed in the next
section.

3. The Problem of Boundaries

     One problem that any noun group processor has to solve is
the problem of boundaries. Where does a noun group end? In most
cases the answer to this question is quite simple: things like
verbs, commas, prepositions, and articles terminate most noun
groups. In practice, however, none of these indicators is very
reliable. Consider the following example that NGP has to deal
with:

     (1) THE U.S. FORCES FIGHT IN VIETNAM IS HOPELESS.

     This example illustrates the difficulties arising from the
ambiguity of the part of speech classification of the words
FORCES and FIGHT. When the context does not provide an early
disambiguation we have to make a guess and then later correct it
if necessary. As a first guess, NGP collects the maximum number
of elements into a noun group. Thus it includes both FORCES and
FIGHT rather than stopping after THE U.S.



     (2) BILL, JOHN, AND MARY LEFT.

     (3) BILL KICKED JOHN, AND MARY KICKED BILL.

     BILL, JOHN, AND MARY in the second example constitute one
semantic unit

     (#GROUP MEMBER (#PERSON FIRSTNAME (BILL))
             MEMBER (#PERSON FIRSTNAME (JOHN))
             MEMBER (#PERSON FIRSTNAME (MARY)))

But is it reasonable to consider this phrase as a single noun
group on the surface level? Example (3) shows that JOHN, AND
MARY might be different groups. Expectations external to the noun
group must decide whether these three words can be clustered in
one group. The same is true for examples (4) and (5), where the
phrase ON THE TRAY may or may not be attached to the noun phrase
THE GLASS.

     (4) JOHN SAW THE GLASS ON THE TRAY.

     (5) JOHN PUT THE GLASS ON THE TRAY.

On the other hand, the preposition OF in the phrase OF STATE in
example (6)

     (6) U.S. ASSISTANT SECRETARY OF STATE MARSHALL GREEN

is predicted by the noun SECRETARY, and can be interpreted by the
noun group processor without outside help. This brings in the
following principle of noun group processing:

     ANY UNEXPECTED WORD WHICH IS INCOMPATIBLE WITH THE CURRENT
     NOUN GROUP TERMINATES THE GROUP ON THE PRECEDING WORD.

Control is returned to the higher level routine which called the
noun group processor and which decides how the group will be
used. It might be attached to a preceding noun group or used
otherwise.

     Semantically, a phrase like

     (7) A RECENT YALE GRADUATE, JIM MEEHAN, 27, ASSISTANT PROFESSOR
         OF COMPUTER SCIENCE AT UCI (was awarded ...)

is one PP and, therefore, should be considered one noun group.
From the processing point of view, we need a more restricted
definition of SURFACE noun groups. A SURFACE NOUN GROUP (or,
simply, noun group) is a string of words which can be processed
by NGP without relinquishing control to the higher processor.

     What are the rules of compatibility which determine the
boundaries of a surface noun group? All semantic nodes that can
be used in a noun group must belong to one of the following
classes: ADJECTIVE, ADVERB, NOUN, TITLE, NAME, NUMBER, DETERM,
and BOGUS. (This information is stored on the node under the
property MARKER.) Class BOGUS is reserved for unknown words, and
will be discussed later. Class TITLE contains all the words
which can be followed by a name: professor, doctor, patrolman,
president, etc. The noun group is processed from left to right
as long as the following conditions are satisfied:

(1) Each word which is not specifically expected must belong to
    one of the classes mentioned above.
(2) No word can precede a DETERM.
(3) ADJECTIVES, ADVERBS, and NUMBERS cannot be preceded by either
    NOUNS, TITLES, or NAMES.
(4) TITLES and NOUNS cannot be preceded by a NAME.
(5) A NAME cannot be immediately preceded by a NOUN.
(6) A NAME cannot be preceded by a DETERM.

For example, phrase (7) will be processed as four separate noun
groups:

(a) A RECENT YALE GRADUATE - ends with a comma, but even if this
    comma were missing, the phrase would have been terminated at
    the same place by NAME, using rules 5 and 6
(b) JIM MEEHAN - ends with a comma
(c) 27 - special case of a noun group - an age group
(d) ASSISTANT PROFESSOR OF COMPUTER SCIENCE AT UCI - ends with
    WAS which is a verb

Noun groups OF COMPUTER SCIENCE and AT UCI are processed without
leaving NGP since the word PROFESSOR sets up expectations for
them.
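
Rules (1) - (6) can be read as a test applied to each incoming MARKER. The Python sketch below is one possible reading and not the program's code: the names fits and group_length are hypothetical, "preceded" is taken to mean anywhere earlier in the group except in rule 5, words that are specifically expected (such as OF after PROFESSOR) are ignored, and the markers assigned to the example words are assumptions.

```python
CLASSES = {"ADJECTIVE", "ADVERB", "NOUN", "TITLE", "NAME", "NUMBER",
           "DETERM", "BOGUS"}


def fits(seen, marker):
    """Check whether a word with this MARKER may extend the group `seen`."""
    if marker not in CLASSES:                                    # rule 1
        return False
    if marker == "DETERM" and seen:                              # rule 2
        return False
    if marker in ("ADJECTIVE", "ADVERB", "NUMBER") and any(
            m in ("NOUN", "TITLE", "NAME") for m in seen):       # rule 3
        return False
    if marker in ("TITLE", "NOUN") and "NAME" in seen:           # rule 4
        return False
    if marker == "NAME" and seen and seen[-1] == "NOUN":         # rule 5
        return False
    if marker == "NAME" and "DETERM" in seen:                    # rule 6
        return False
    return True


def group_length(markers):
    """Return how many leading words form one surface noun group."""
    seen = []
    for marker in markers:
        if not fits(seen, marker):
            break
        seen.append(marker)
    return len(seen)


# A RECENT YALE GRADUATE JIM MEEHAN, with the comma missing.
# Treating YALE as a NOUN is an assumption made for this sketch.
markers = ["DETERM", "ADJECTIVE", "NOUN", "NOUN", "NAME", "NAME"]
print(group_length(markers))
```

With these markers the group is cut before JIM, since a NAME may neither immediately follow a NOUN nor follow a DETERM.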

     Rules (1) - (6) are much looser than the usual syntactic
rules for noun groups (see, for example, Winograd 1972). But our
goal is not the rejection of syntactically incorrect sentences.
We introduce restrictions only where they help, where their
absence creates disambiguation or processing difficulties.

     The other distinctive feature of our rules is that they are
generated dynamically and can be changed by actions of any
expectation. This is how, for example, possessives are handled:



     (8) POLICE CHIEF'S NEW CAR

First the node for POLICE CHIEF is built:

     NGP1:
          VALUE = (#PERSON OCCUPATION (POLICE-CHIEF))
          MARKER = TITLE

Then the program sees the possession mark which satisfies a
special default expectation. The action of this expectation
transforms NGP1 into:

     NGP1:
          VALUE = (#PERSON OCCUPATION (POLICE-CHIEF))
          MARKER = ADJECTIVE
          FORWARD = "If the next node is a #PHYSOBJ then make it POSSBY
                    the value of NGP1, i.e. by (#PERSON OCCUPATION
                    (POLICE-CHIEF))"

4. Putting Pieces Together

     In the previous section I described the basic noun group
processor. Complex noun groups are broken into simpler phrases
which are processed separately. Separately, however, does not
mean independently. The previously built part of the noun group
can affect the analysis of the remaining parts. In this section
I will describe the mechanism of this interaction and how various
parts of a noun group are put together.



     In accordance with our general principles, this process is
driven by a hierarchically organized set of expectations. There
are two kinds of expectations: (1) those dynamically generated
by the input and (2) default expectations supplied by the control
mechanism. These default expectations are designed to catch such
unexpected things as appositives, addresses, age groups, etc.
For example, when we hear A CONNECTICUT MAN in

     (1) (The award was given to) A CONNECTICUT MAN, JOHN DOE,
         AGE 23, OF 234 COLLEGE AVENUE, NEW HAVEN...

we do not necessarily immediately expect to hear his name, age,
and address, although we know that as a person he has these
characteristics. These are secondary, default expectations which
are tested only if other, explicit expectations fail. In the
above example the processing goes as follows:

     First, A CONNECTICUT MAN is collected, generating

     (2) (#PERSON GENDER (MALE)
                  RESIDENCE (#LOCALE STATE (*CONN*)))

At this point, control returns to ELI which tests the
expectations which were pending before we reached this phrase.
One of these expectations is satisfied and its action puts
structure (2) into the waiting slot in a larger frame:

     ((ACTOR (NIL) <=> (*ATRANS*) OBJECT (*AWARD*)
       TO (#PERSON GENDER (MALE)
                   RESIDENCE (#LOCALE STATE (*CONN*)))))

. • ■" ' ■ • 

.'^ " 

The slot that (2) filled is remembered in the variable called
LASTNG. Then comes JOHN DOE. No explicit expectations are
satisfied. The monitor goes to a special mode called TRAP. TRAP
checks whether LASTNG was a person and, if so, checks the default
expectations about a person. The NAME expectation is satisfied
and the specialized action which collects personal names is
executed. As a result name modifiers are attached to the male
Connecticut resident:




     (#PERSON GENDER (MALE)
              RESIDENCE (#LOCALE STATE (*CONN*))
              FIRSTNAME (JOHN) LASTNAME (DOE))



After this, control goes back to the top level processor. This
reads the next word, "27". Again, no expectations are
immediately satisfied and the monitor traps into the secondary
expectations. The AGE expectation is satisfied and the
specialized action which collects AGE specification groups is
executed. The result is an AGE modifier which is attached to
John. OF 234 COLLEGE AVENUE also goes to TRAP, which calls the
address group processor. The final result is:

     (#PERSON GENDER (MALE)
              RESIDENCE (#LOCALE STATE (*CONN*)
                                 STREETNUMBER (234)
                                 STREETNAME (COLLEGE AVENUE))
              FIRSTNAME (JOHN) LASTNAME (DOE))
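
The way TRAP falls back on default expectations for the item in LASTNG can be illustrated with a short Python sketch. Everything in it is hypothetical: the three matcher functions use deliberately crude surface tests, the dictionary layout with a HEADER key is an assumption, and the real TRAP has a more complex flow of control than this single loop.

```python
def name_expectation(person, words):
    """Default NAME expectation: two alphabetic words are taken as a name."""
    if len(words) == 2 and all(w.isalpha() for w in words):
        person["FIRSTNAME"], person["LASTNAME"] = words
        return True
    return False


def age_expectation(person, words):
    """Default AGE expectation: a bare number, or AGE followed by a number."""
    if words[-1].isdigit() and words[:-1] in ([], ["AGE"]):
        person["AGE"] = words[-1]
        return True
    return False


def address_expectation(person, words):
    """Default address expectation: OF, a street number, a street name."""
    if len(words) >= 3 and words[0] == "OF" and words[1].isdigit():
        residence = person.setdefault("RESIDENCE", {"HEADER": "#LOCALE"})
        residence["STREETNUMBER"] = words[1]
        residence["STREETNAME"] = " ".join(words[2:])
        return True
    return False


# Default (secondary) expectations per class; only #PERSON is sketched.
DEFAULTS = {"#PERSON": [name_expectation, age_expectation,
                        address_expectation]}


def trap(lastng, words):
    """Test the default expectations for the class of the item in LASTNG.

    `words` is a non-empty word group that no explicit expectation caught.
    """
    for expectation in DEFAULTS.get(lastng["HEADER"], []):
        if expectation(lastng, words):
            return True
    return False


lastng = {"HEADER": "#PERSON", "GENDER": "MALE",
          "RESIDENCE": {"HEADER": "#LOCALE", "STATE": "*CONN*"}}
for group in (["JOHN", "DOE"], ["AGE", "23"],
              ["OF", "234", "COLLEGE", "AVENUE"]):
    trap(lastng, group)
print(lastng)
```

Keying the defaults by header keeps them tied to the general properties of a class of things, so other classes could get their own lists without touching the control loop.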



     The following example illustrates a slightly different
problem:

     (3) LOUIS CAPPIELLO, YALE POLICE CHIEF

In order to figure out that being a YALE POLICE CHIEF is LOUIS
CAPPIELLO's occupation, we first have to collect both noun groups.
This is done with the help of another secondary expectation
called EXTRA-NOUNGR trap. LOUIS CAPPIELLO generates:

     (#PERSON FIRSTNAME (LOUIS) LASTNAME (CAPPIELLO))

YALE POLICE CHIEF generates:

     (#PERSON OCCUPATION (YALE-POLICE-CHIEF))

Then another secondary expectation tests to see if LASTNG and
EXTRANG could be the same thing. If so, the two groups are
merged.
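
A minimal Python sketch of such a merge is given below. The text does not say how "could be the same thing" is decided, so the test used here (same header, no property with two different values) is an assumption, as are the function name merge and the dictionary layout.

```python
def merge(lastng, extrang):
    """Merge two structures if they could describe the same thing.

    Returns the merged structure, or None when the headers differ or a
    property has conflicting values (an assumed compatibility test).
    """
    if lastng["HEADER"] != extrang["HEADER"]:
        return None
    merged = dict(lastng)
    for name, value in extrang.items():
        if name in merged and merged[name] != value:
            return None
        merged[name] = value
    return merged


louis = {"HEADER": "#PERSON", "FIRSTNAME": "LOUIS", "LASTNAME": "CAPPIELLO"}
chief = {"HEADER": "#PERSON", "OCCUPATION": "YALE-POLICE-CHIEF"}
print(merge(louis, chief))
```

Returning a new structure instead of changing LASTNG in place leaves both groups intact when the merge is rejected.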

     Appositives can be arbitrarily complex - from simple name
groups to complicated prepositional phrases and relative clauses.
Very rarely are they explicitly expected. They are handled by
the secondary expectations based on the general properties of
things and the knowledge about the ways these things can be
expressed in English. TRAP represents an attempt to implement
the mechanism controlling the interaction between these
expectations.



     TRAP is still in the experimental stage of development. Its
flow of control is rather complex. In general, first, it tries
to find and test expectations about general properties of the
item in LASTNG. For example, for a person, it tries to collect
special modifiers such as name, age, and address. If all these
expectations fail, TRAP checks for possible appositives such as
simple EXTRA noun groups, prepositional phrases, or relative
subclauses. If one of these appositives is collected, TRAP first
checks the explicit expectations which may have been pending (for
example, a WHICH-clause might want to be attached to a particular
physical object) and then checks the secondary expectations
again. This time, it may catch some properties which it missed
the first time because they were encoded in a more complicated
form. In order to clarify this description let us follow a few
more examples:



     (3) JOHN DOE OF GENERAL MOTORS

The subgroup OF GENERAL MOTORS is caught by TRAP's prepositional
phrase expectation. Since there are no specific expectations
which can link JOHN DOE and GENERAL MOTORS, the default one,
attached to OF, is checked. Its action links the two groups as
follows:

     (#PERSON FIRSTNAME (JOHN) LASTNAME (DOE)
              SOMEREL (#ORGANIZATION ORGNAME (GENERAL-MOTORS)))

SOMEREL means that we do not really know the exact nature of the
relation between JOHN DOE and GENERAL MOTORS.




In the following example 



(4) US NAVY TASK FORCE WHICH HAS BEEN ON PATROL DUTY IN THE
INDIAN OCEAN (left the area).

the WHICH clause is collected by TRAP's subclause expectation and
is attached to US NAVY TASK FORCE by an expectation associated
with WHICH. The result is:

(#GR-ORG PARTOF (#ORGANIZATION BRANCH (NAVY)
                               PARTOF (*USA*))
         REL ((ACTOR ...
               <=> ($PATROL PLACE (*INDIAN-OCEAN*)))))
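
The flow of control of TRAP described above can be sketched in a
few lines of Python. This is only an illustration under assumptions
of our own: the function trap, the representation of an expectation
as a function which returns True when it fires, and the toy
WHO-IS collector and age expectation are all hypothetical and are
not taken from the actual program.

```python
from typing import Callable, Dict, List, Optional

Item = Dict[str, str]                                   # stand-in for the item in LASTNG
Expectation = Callable[[Item, List[str]], bool]         # True if it fired and updated the item
Collector = Callable[[List[str]], Optional[List[str]]]  # content of an appositive, or None


def trap(item: Item, words: List[str],
         secondary: List[Expectation],
         collectors: List[Collector],
         explicit: List[Expectation]) -> str:
    """Return a label saying which stage, if any, consumed the words."""
    # Stage 1: expectations about general properties, tried on the raw input.
    for exp in secondary:
        if exp(item, words):
            return "secondary"
    # Stage 2: all of them failed, so try to collect an appositive.
    for collect in collectors:
        content = collect(words)
        if content is None:
            continue
        # Pending explicit expectations get the first chance ...
        for exp in explicit:
            if exp(item, content):
                return "explicit"
        # ... and then the secondary expectations are checked again.
        for exp in secondary:
            if exp(item, content):
                return "secondary-after-appositive"
        return "unattached"
    return "nothing"


def age_expectation(item: Item, words: List[str]) -> bool:
    # Hypothetical property test: a person may be followed by an age.
    if item.get("type") == "PERSON" and words and words[0].isdigit():
        item["AGE"] = words[0]
        return True
    return False


def who_is_clause(words: List[str]) -> Optional[List[str]]:
    # Toy relative-clause collector: "WHO IS X ..." yields ["X", ...].
    if words[:2] == ["WHO", "IS"] and len(words) > 2:
        return words[2:]
    return None


person = {"type": "PERSON", "LASTNAME": "DOE"}
stage = trap(person, ["WHO", "IS", "25"], [age_expectation], [who_is_clause], [])
print(stage, person)
```

In this toy run the age is missed on the first pass, because it is
wrapped in a relative clause, and is caught on the second pass once
the clause has been collected.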

Subclause processing represents a difficult problem on its
own. The problem of subclause boundaries, for example, is as
complex as that of noun groups. In solving it, I used the same
philosophy as for noun group boundaries: the current subclause
is finished when the next word is not expected by any
expectations from that subclause.
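
The boundary rule can be illustrated by the following minimal
sketch. It assumes, for simplicity, that the expectations of a
subclause are fixed tests on single words; in the real program
expectations are created and removed as words are read. The names
collect_subclause, VERB_WORDS, and PLACE_WORDS are hypothetical.

```python
from typing import Callable, List, Tuple

WordTest = Callable[[str], bool]


def collect_subclause(words: List[str],
                      expectations: List[WordTest]) -> Tuple[List[str], List[str]]:
    """Collect words until one is not expected by any expectation."""
    collected: List[str] = []
    for position, word in enumerate(words):
        if not any(test(word) for test in expectations):
            # Nobody in this subclause expects the word: the subclause is finished.
            return collected, words[position:]
        collected.append(word)
    return collected, []


# Toy expectations for the WHICH clause of example (4).
VERB_WORDS = {"WHICH", "HAS", "BEEN"}
PLACE_WORDS = {"ON", "PATROL", "DUTY", "IN", "THE", "INDIAN", "OCEAN"}
expectations = [lambda w: w in VERB_WORDS, lambda w: w in PLACE_WORDS]

sentence = "WHICH HAS BEEN ON PATROL DUTY IN THE INDIAN OCEAN LEFT THE AREA".split()
clause, rest = collect_subclause(sentence, expectations)
print(clause)
print(rest)
```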

The traditional stumbling block of all parsers, the AND
conjunction, is also handled by a series of TRAP expectations.
Although in difficult cases we cannot avoid backtracking, simple
cases like

(5) JOHN AND MARY ATE SOUP AND LASAGNA AND LEFT.

can be processed by the program with the help of the following
heuristics. If AND is not specifically expected and occurs in
the sentence between two noun groups which can be combined in one
semantic unit then it is interpreted as a link between the two
noun groups. Otherwise, if AND occurs in the sentence after the
verb it is interpreted as a link between two clauses.
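
The two heuristics can be written down as a small decision
function. The sketch below is only an illustration: the table
SEMANTIC_CLASS, the test for "can be combined in one semantic
unit" (here simply membership in the same class), and the result
labels are assumptions made for the example.

```python
from typing import Dict, Optional

# Hypothetical semantic classes for the words of example (5).
SEMANTIC_CLASS: Dict[str, str] = {
    "JOHN": "PERSON", "MARY": "PERSON",
    "SOUP": "FOOD", "LASAGNA": "FOOD",
}


def combinable(left: Optional[str], right: Optional[str]) -> bool:
    """Toy test: two noun groups combine if they belong to the same class."""
    if left is None or right is None:
        return False
    left_class = SEMANTIC_CLASS.get(left)
    return left_class is not None and left_class == SEMANTIC_CLASS.get(right)


def interpret_and(left: Optional[str], right: Optional[str],
                  after_verb: bool, specifically_expected: bool = False) -> str:
    """left and right are the neighbouring noun groups (None if not a noun group)."""
    if specifically_expected:
        return "HANDLED-BY-EXPECTATION"
    if combinable(left, right):
        return "NOUN-GROUP-LINK"
    if after_verb:
        return "CLAUSE-LINK"
    return "UNRESOLVED"        # a difficult case; backtracking may be needed


# The three occurrences of AND in example (5):
first = interpret_and("JOHN", "MARY", after_verb=False)
second = interpret_and("SOUP", "LASAGNA", after_verb=True)
third = interpret_and("LASAGNA", None, after_verb=True)   # followed by the verb LEFT
print(first, second, third)
```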

All examples presented so far deal with noun groups
describing Picture Producers. The next example shows how Concept
Producers are handled.

(6) (Castro condemned) THE EXECUTION OF THOUSANDS OF COMMUNISTS
IN INDONESIA.

THE EXECUTION refers to the script $EXECUTION. This script has
among its roles the VICTIM of the execution. Among the
expectations associated with the script there is one which
expects the victim to be a person (or a group of people)
introduced by the preposition OF. Hearing the word EXECUTION
sets up an expectation for the word OF (someone). THOUSANDS OF
is another unit, which creates a group whose members follow. This
expectation is satisfied by COMMUNISTS. When IN INDONESIA comes,
it is not expected by anybody. Hence, the noun group collection
is suspended and THE EXECUTION, which is now transformed into:

($EXECUTION VICTIM (#GROUP MEMBER (#PERSON
                                      OCCUPATION (COMMUNIST)
                                      COMPNUM ORDER VAL (1000))))

is placed in the MOBJECT slot of MTRANS for "condemned". After
this, IN INDONESIA is collected:

(LOC VAL (*INSIDE* PARTOF (*INDONESIA*)))






Now the processor must decide whether Indonesia was the place
where the execution occurred or where it was condemned by Castro.
In the absence of other expectations, the program picks the first
alternative.
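
The default choice can be pictured as follows. This is a sketch
under our own assumptions: the function attach and the
representation of an expectation as a test on a candidate and a
modifier are hypothetical, and the candidates are listed with the
noun group first.

```python
from typing import Callable, List


def attach(modifier: str, candidates: List[str],
           claims: Callable[[str, str], bool]) -> str:
    """Attach the modifier to the first candidate which expects it;
    if nobody expects it, fall back to the first alternative."""
    for candidate in candidates:
        if claims(candidate, modifier):
            return candidate
    return candidates[0]


def no_expectations(candidate: str, modifier: str) -> bool:
    # Nobody is waiting for a locative phrase.
    return False


chosen = attach("IN INDONESIA", ["$EXECUTION", "MTRANS"], no_expectations)
print(chosen)
```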

To conclude this section, I would like to discuss the
treatment of words unknown to the program. People have a limited
ability to interpret such words from context, or, at least, to
ignore them. We tried to put some of this kind of intelligence
in our programs. The problem has two aspects. First, we have to
figure out what role the unknown word (or words) might play in
the sentence and then interrogate the context to find out what
meaning this word might have. The borderline between these two
tasks is very vague. As of now, most of the first part is
handled by NGP and most of the second part by Rick Granger's
program called FOUL-UP (Granger 1977). The following examples
illustrate how the NGP part works.

(7) JOHN ATE A FOO FISH.

FOO is interpreted as an unknown modifier and ignored.

(8) JOHN ATE A BLUE FOO.

The output of NGP

(#BOGUS COLOR (BLUE) LEXVAL (FOO) REF (INDEF))

is handed to FOUL-UP for further investigation.

(9) DR FOO BAZ ATE A BLUE FISH.

FOO BAZ are interpreted as the first and the last names of a
person whose occupation is DOCTOR.

(10) FOO'S FISH WAS BAD.

FOO is interpreted as the last name unless (9) and (10) occurred
in the same story, in which case FOO would have already been
defined as a first name.

(11) JOHN WAS TAKEN TO THE HOSPITAL BY FOO AMBULANCE.

FOO is interpreted to be a name of an ambulance company, since
AMBULANCE has a BACKWARD expectation looking for a company name.

(12) 593 FOO BAZ AVENUE

FOO BAZ is interpreted as the name of an avenue.
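
Some of the decisions illustrated by examples (7), (8), (9), and
(11) can be summarized in a small function. The sketch is a
simplification made for illustration only: the tables KNOWN_NOUNS,
TITLES, and BACKWARD, the function role_of_unknown, and the role
labels are hypothetical, and the possessive and street address
cases of (10) and (12) are left out.

```python
from typing import Dict, Optional

KNOWN_NOUNS = {"FISH", "AMBULANCE"}
TITLES: Dict[str, str] = {"DR": "DOCTOR"}
# Head nouns with a BACKWARD expectation, and the role they look for on their left.
BACKWARD: Dict[str, str] = {"AMBULANCE": "COMPANY-NAME"}


def role_of_unknown(previous: Optional[str], following: Optional[str]) -> str:
    """Guess the role of an unknown word from its two neighbours."""
    if previous in TITLES:
        return "PERSON-NAME"            # (9)  DR FOO BAZ
    if following in BACKWARD:
        return BACKWARD[following]      # (11) FOO AMBULANCE
    if following in KNOWN_NOUNS:
        return "IGNORED-MODIFIER"       # (7)  A FOO FISH
    return "BOGUS-HEAD"                 # (8)  A BLUE FOO, handed to FOUL-UP


print(role_of_unknown("A", "FISH"))
print(role_of_unknown("BLUE", None))
print(role_of_unknown("DR", "BAZ"))
print(role_of_unknown("BY", "AMBULANCE"))
```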



5. Memory and Language Processing



In this section we would like to discuss some general
problems of memory and understanding as related to one very
practical task. Originally the idea to write a noun group parser
appeared in connection with our preliminary work on the WEIS
project. As mentioned in the introduction, WEIS is a system
designed to understand and classify isolated newspaper headlines
on international relations. The classifications of headlines
about international interactions are triples: ACTOR (country),
ACTION (one of 20 selected international interactions), and
TARGET (country). The list of 50 headlines that our system can
handle (as of March 1, 1977) is given in Appendix 1.

An earlier attempt to obtain such a classification directly
from the input text using the "simplest possible syntax relative
to the ACTOR-ACTION-TARGET semantics" failed dramatically
(Tripodes et al. 1974). It was clear to us from the beginning
that in order to correctly encode a sentence one has to
understand it. Further, one cannot speak about understanding
without meaning representation and memory models. Conceptual
Dependency was our natural choice for a meaning representation
system. As for the memory, we thought that a very limited model
containing only basic information about countries, people,
physical objects, and some organizations would be sufficient for
the task. This model proved to be inadequate. To determine the
meaning of even simple sentences we need much more detailed
knowledge about current and past relations between countries,
their size, policies, and many other features. Consider the
following examples:

(1) LEBANESE OFFICIALS SEIZED 1500 RIFLES FROM BULGARIA (*)

Were the rifles owned by Bulgaria, made in Bulgaria, or did they
come from Bulgaria? A reader who follows international relations
would know that it is highly implausible that the Lebanese
officials would enter into direct conflict with Bulgaria by
seizing its property. An informed reader would also know that
there are armed groups in Lebanon who receive supplies from
Communist countries. Thus, he would conclude that the rifles
probably came from Bulgaria and were seized from an unknown
party. The meaning of (1) can be expressed in CD using script
notation, as follows:

(*) Some of the examples in this section might look cumbersome or
artificial, but, in fact, all of them are real newspaper
headlines which WEIS had to process and classify.

(ACTOR (#SR-ORG MEMBER
          (#PERSON PARTOF
             (#ORGANIZATION TYPE (GOVERNMENT)
                            PARTOF (LEBANON))))
 <=> ($SEIZE)
 FROM X
 OBJECT (#GROUP MEMBER Y))

where Y is

(#PHYSOBJ TYPE (WEAPON) COMPNUM (1500)
   REL (ACTOR (SOMEONE) <=> (ATRANS) OBJECT Y
        FROM (*BULGARIA*) TO ...

If, on the other hand, the headline had been ISRAEL SEIZED RIFLES
FROM EGYPT, with the two countries engaged in a direct conflict,
then a knowledgeable reader would have probably concluded that
the rifles were seized from Egypt. Here different
interpretations lead to different WEIS encodings with different
TARGETs for the ACTIONs.
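
The effect of world knowledge on the encoding can be shown with a
deliberately small sketch. Everything in it is an assumption made
for illustration: the table IN_CONFLICT stands in for a memory of
relations between countries, the action label SEIZE is
hypothetical, and a missing TARGET (None) stands for the unknown
party from whom the rifles were seized.

```python
from typing import FrozenSet, NamedTuple, Optional, Set


class WeisCode(NamedTuple):
    actor: str
    action: str
    target: Optional[str]


# Toy memory: pairs of countries engaged in a direct conflict.
IN_CONFLICT: Set[FrozenSet[str]] = {frozenset({"ISRAEL", "EGYPT"})}


def encode_seizure(actor: str, from_country: str) -> WeisCode:
    """Encode '<actor> SEIZED RIFLES FROM <from_country>'."""
    if frozenset({actor, from_country}) in IN_CONFLICT:
        # FROM links the verb to its indirect object: the country is the TARGET.
        return WeisCode(actor, "SEIZE", from_country)
    # Otherwise FROM qualifies the rifles (their origin); the TARGET is unknown.
    return WeisCode(actor, "SEIZE", None)


print(encode_seizure("ISRAEL", "EGYPT"))
print(encode_seizure("LEBANON", "BULGARIA"))
```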






The difficulty in the above examples comes from the
ambiguity of the word FROM. It can be a link between the verb
SEIZE and its indirect object or it can link a qualifier to a
noun group. In general, prepositions help us to identify the
roles played by the words they precede, but very often they
are not sufficient. Consider the preposition BY in the following
sentence:

(2) USA PROTESTS INDIA'S ABANDONMENT OF NEUTRALITY BY
ESTABLISHING FULL DIPLOMATIC RELATIONS WITH NORTH VIETNAM

Even after we have established that BY introduces the instrument
of an action (which in itself is a nontrivial task), we still do
not know which action this instrument modifies. Who established
full diplomatic relations with North Vietnam, the USA or India?
One has to be acquainted with the corresponding political
situation in order to reject the first interpretation by making
the inference that the USA was not likely to establish full
diplomatic relations with North Vietnam, but India was, and such
an act would, in fact, be a violation of neutrality from the US
viewpoint.

The preposition IN is even more troublesome:

(3) CASTRO CONDEMNED THE EXECUTION OF COMMUNISTS IN INDONESIA

Here again an informed reader would know that Castro was not
likely to pronounce his condemnations in Indonesia, and hence we
conclude that it was the execution of communists which took place
in that country.




Another difficulty is the scope of the prepositions.
Consider:

(4) SOUTH KOREA SMASHES 7 NORTH KOREAN ESPIONAGE RINGS INVOLVING
9 SPIES AND 14 COLLABORATORS IN SEOUL, TAEGU, AND POHANG

For some reason we merge 9 spies and 14 collaborators in a group
of 23 individuals, which is split into 7 groups that are
distributed in three South Korean cities. If, instead of 9 SPIES
we had 9 COMMUNICATION SATELLITES, then we would have placed only
the 14 collaborators in these cities, keeping the location of the
satellites unspecified.

Semantic ambiguity does not have to be related to any
particular preposition. Consider the following example:

(5) CAMBODIA HOSTS USA ASSISTANT SECRETARY OF STATE FOR
BRIEFING ON THE OUTCOME OF US PRESIDENT NIXON VISIT TO CHINA

Who was briefing whom? Our knowledge of the international
situation at the time of Nixon's first visit to China tells us
that it was the USA who was briefing Cambodia.







Even a relatively simple noun phrase such as RUSSIAN RADAR
INSTALLATION in

(6) ISRAEL SEIZES A RUSSIAN RADAR INSTALLATION IN EGYPT

can be a source of a mistake in encoding. Only the knowledge of
the precise nature of the relations between Israel, Egypt, and
the USSR allows the reader to conclude that the radar in question
was made in rather than possessed by the USSR, and that the
TARGET of the Israeli ACTION was Egypt rather than Russia.

The correct understanding and classification of the above
examples requires very detailed knowledge of international
relations. And these were rather simple sentences whose meaning
seems obvious to most people. Many real newspaper headlines are
much more puzzling:

(7) JORDAN SAID THE ARABS FAILED THE TEST

(8) FORD TO NEW YORK: DROP DEAD



Suppose now that we have a detailed model of the political
world which enables us to make all necessary inferences about
international affairs. Will such a model be sufficient for the
correct understanding of political headlines? On the surface the
answer is yes. With such a model we would be able to make all
the inferences we needed in our analysis of the examples in this
section. But note that in our discussion of these examples we
only listed the necessary inferences. We said nothing about how
we arrived at the necessity to use these particular inferences.
In other words, the memory itself is not enough. We need to know
how to get the parser to ask the memory the right questions.
This paper describes in detail how such questions are treated
inside English noun groups. Most of the examples in this section
go beyond the noun group framework. We were able to handle them
by attaching the ad hoc requests to the secondary expectations
defining instrumental and locative prepositions. This is not
always a satisfactory solution, and finding a general solution to
this problem is one of the areas of our current research.



6. Comparison with other Work and Conclusions 



The work presented in this paper is a further development of
ELI. The main difference between this program and most other
parsers (see, for example, Winograd 1972, Woods and Kaplan 1971)
is that it does not separate its linguistic knowledge from its
general world knowledge. In other programs the analysis is done
in two stages: first the input is analyzed syntactically and
then the result is interpreted semantically. For example, LUNAR
(Woods and Kaplan 1971) uses the Augmented Transition Network
Grammar (Woods 1970) to generate possible syntactic
interpretations of a given sentence and then applies its domain
knowledge to determine whether the interpretation is meaningful.
Thus, noun groups are parsed purely syntactically and their
meaning is not established until the whole sentence is parsed.
In each noun group the first noun is assumed to be the head noun.
If later this turns out to be incorrect, the system backs up and
tries to accumulate more elements into the noun group. For
example, the correct processing of the phrase PRESIDENT JIMMY
CARTER, which contains three nouns, will require LUNAR to back up
twice. This means that a great deal of unnecessary effort is
spent in finding syntactically plausible but meaningless parses.
This is especially true when one tries to relax some syntactic
rules to allow for slightly incorrect sentences. In NGP the
parsing is done by the use of rules most appropriate in a given
situation, semantic or syntactic. Thus, in the example above,
the programs contained in the dictionary entry for the word
PRESIDENT will immediately collect JIMMY CARTER. Most of the
program's linguistic knowledge is not built into its control
structure but stored in the dictionaries and used as a part of
its general knowledge. This makes the program very flexible,
easily extensible, and provides for the correct processing of
"ungrammatical" sentences.
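
The idea of keeping the collecting program in the dictionary
entry itself can be illustrated as follows. The sketch is not the
actual NGP dictionary format: the function president_entry, the
table DICTIONARY, the set NAMES of words known to be names, and
the slot names are assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Concept = Dict[str, str]
# A dictionary entry is a program: given the words and the position of the
# word it belongs to, it returns a concept and the position where it stopped.
Entry = Callable[[List[str], int], Tuple[Concept, int]]

NAMES = {"JIMMY", "CARTER"}          # toy list of words known to be names


def president_entry(words: List[str], position: int) -> Tuple[Concept, int]:
    concept: Concept = {"type": "PERSON", "OCCUPATION": "PRESIDENT"}
    collected: List[str] = []
    nxt = position + 1
    # The entry expects up to two name words right after PRESIDENT.
    while nxt < len(words) and words[nxt] in NAMES and len(collected) < 2:
        collected.append(words[nxt])
        nxt += 1
    if len(collected) == 2:
        concept["FIRSTNAME"], concept["LASTNAME"] = collected
    elif len(collected) == 1:
        concept["LASTNAME"] = collected[0]
    return concept, nxt


DICTIONARY: Dict[str, Entry] = {"PRESIDENT": president_entry}


def parse_noun_group(words: List[str]) -> Tuple[Concept, int]:
    """Run the program stored under the first word; no backing up is involved."""
    return DICTIONARY[words[0]](words, 0)


concept, stopped_at = parse_noun_group(["PRESIDENT", "JIMMY", "CARTER", "SPOKE"])
print(concept, stopped_at)
```

In this toy version the whole group is collected in a single
left-to-right pass, because the knowledge that a title is followed
by a name sits in the entry for PRESIDENT rather than in a
separate grammar.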

Another important difference between this program and both
Winograd's and the LUNAR system is in the representation of
meaning. The meaning of a sentence in Winograd's system is a
program for manipulating blocks. The meaning of a sentence in
the LUNAR system is a request for information about some
properties of the rocks from the Moon. Both these systems are
very specialized and not easily extensible to other domains. Our
analyzer is based on the Conceptual Dependency representation
system which is not limited to any particular domain. The same
program can handle a wide variety of topics, from car accident
reports to state visits to China.

The results presented in this paper show that both
linguistic and world knowledge are required for correct and
efficient handling of noun groups. The program demonstrates the
possibility and the advantages of the simultaneous application of
both kinds of knowledge, without separating the process of
understanding into syntactic and semantic stages. The program
provides an intuitively plausible model for a hierarchically




7. References

1] Cullingford, R.E. (1975). An Approach to the Representation
   of Mundane World Knowledge: The Generation and Management of
   Situational Scripts. American Journal of Computational
   Linguistics. Microfiche 44.

2] Cullingford, R.E. (1977). Organizing World Knowledge for
   Story Understanding by Computer. Unpublished Doctoral
   Dissertation. Department of Engineering and Applied Science,
   Yale University.

3] Granger, R.H. (1977). FOUL-UP. Paper to be presented at
   the Fifth International Joint Conference on Artificial
   Intelligence. Cambridge, Massachusetts.

4] Newell, A. (1973). Production Systems: Models of Control
   Structures. In Chase, W.C. ed. Visual Information
   Processing. Academic Press, New York.

5] Riesbeck, C.K. and Schank, R.C. (1976). Comprehension by
   Computer: Expectation-based Analysis of Sentences in
   Context. Yale Dept. of Comp. Sci. Research Report #78.

6] Schank, R.C. (1972). Conceptual Dependency: A Theory of
   Natural Language Understanding. Cognitive Psychology
   3(4):552-631, 1972.

7] Schank, R.C. (1973). Identification of Conceptualizations
   Underlying Natural Language. In R.C. Schank and Colby,
   eds. Computer Models of Human Thought and Language.
   W. H. Freeman, San Francisco.

8] Schank, R.C. et al. (1975). Conceptual Information
   Processing. North Holland, Amsterdam.

9] Schank, R.C. and Abelson, R.P. (1975). Scripts, Plans, and
   Knowledge. Proceedings of the Fourth International Joint
   Conference on Artificial Intelligence, Tbilisi, USSR.

10] Schank, R.C. and Abelson, R.P. (1977). Scripts, Plans, and
    Understanding. Lawrence Erlbaum Associates, Hillsdale, N.J.

11] Tripodes, P.G., Greyenstein, S., Dolan, P., Bodnaras,
    Shure, G.H. (1974). Automatic Content Coding of English
    Text. Paper submitted for presentation at ACM 1974 San Diego
    Meetings.

12] Wilensky, R. (1976). Machine Understanding of Human
    Intentionality. Proceedings of the ACM Annual Conference.
    Houston, Texas.




13] Winograd, T. (1972). Understanding Natural Language.
    Academic Press, New York.

14] Woods, W.A. (1970). Transition Network Grammars for Natural
    Language Analysis. Comm. ACM 13(10):591-606, 1970.

15] Woods, W.A., and Kaplan, R.M. (1971). The Lunar Sciences
    Natural Language Information System. BBN Report No. 2265.
    Bolt Beranek and Newman Inc. Cambridge, Massachusetts.







Appendix 1

Sentences processed by the YALE-WEIS program

1. LAO FORCES ABANDON BAN-NHIL TO NORTH VIETNAM.

2. USA NAVY TASK FORCE WHICH HAS BEEN ON PATROL DUTY IN THE
   INDIAN OCEAN FOR A MONTH LEAVES THE AREA.

3. CUBA GRANTS ASYLUM TO A USA MARINE.

4. FRANCE SELLS 50 MIRAGE JET PLANES TO LIBYA.

5. USA APPOLLO 12 ASTRONAUTS VISIT INDONESIA.

6. ISRAELI TASK FORCE SEIZES UAR RADAR INSTALLATION ON SHADWAN.

7. LEBANESE OFFICIALS SEIZED 1500 RIFLES FROM BULGARIA.

8. GUINEA EXPELS 1 SPANISH CITIZEN.

9. AUSTRIA EXPELLED 4 CHINESE IN A CONTROVERSY OVER THEIR STATUS
   AND ACTIVITIES.

10. CASTRO CONDEMNED THE EXECUTION OF THOUSANDS OF COMMUNISTS IN
    INDONESIA.

11. SUKARNO EXPLAINED THE EXPULSION OF THE USA NEWSMEN.

12. JORDAN SAID ARABS FAILED THE TEST.

13. ALGERIA PROTESTED TO SPAIN THE DETENTION OF AN ALGERIAN
    DIPLOMAT IN CONNECTION WITH MURDER OF AN OPPOSITION LEADER.

14. USA CONCEDE THAT USA AIR UNITS MIGHT HAVE HIT A CAMBODIAN
    VILLAGE.

15. PRIME MINISTER WILSON SENT A NOTE CONCERNING THE VIETNAM WAR
    TO PREMIER KOSYGIN.

16. VATICAN PRAISED UNITED KINGDOM EFFORTS TOWARD PEACE IN
    VIETNAM.

17. PRESIDENT JOHNSON SENT CONGRATULATORY MESSAGE TO PRIME
    MINISTER INDIRA GANDHI.

18. THAILAND SAYS IT WILL SOON SEND 1000 TROOPS TO VIETNAM.

19. USA PRESIDENT PROMISED ISRAELI PRIME MINISTER HE WOULD GIVE
    CONSIDERATION TO ISRAELI REQUESTS FOR ARMS.

20. SPAIN GIVES BACK TERRITORY OF IFNI TO MOROCCO.

21. SPAIN GIVES POSSESSIONS OF HISTORIAN GARCILASO TO PERU.




22. UAR FORCES ARE BOLSTERED BY KUWAIT.

23. US PRESIDENT ANNOUNCED THAT AUSTRIAN CHANCELLOR ACCEPTED US
    INVITATION TO VISIT THE USA.

24. USA GENERAL SAYS NORTH VIETNAM HAS UPHELD THE BOMBING
    AGREEMENT.

25. SPAIN AND RUMANIA SIGNED AGREEMENT ESTABLISHING FULL CONSULAR
    AND COMMERCIAL RELATIONS.

26. KENIA SIGNS INTERNATIONAL COFFEE AGREEMENT OF 1962.

27. USA, UNITED KINGDOM, NETHERLAND, NORWAY, ASSIGNED WARSHIPS TO
    NEW PERMANENT FORCE OF NATO.

28. FUNERAL OF INDIA'S SHASTRI ATTENDED BY USSR KOSYGIN, USA
    HUMPHREY, UNITED KINGDOM'S BROWN, AFGANISTAN'S MAIMANDA,
    PAKISTAN'S FARUQUE, AND REPRESENTATIVE OF U THANT.

29. CAMBODIA HOSTS USA ASSISTANT SECRETARY OF STATE GREEN FOR A
    BRIEFING ON THE OUTCOME OF US PRESIDENT NIXON VISIT TO CHINA.

30. SOUTH VIETNAMESE FOREIGN MINISTER TRAM VAN LAM SAYS THE SOUTH
    VIETNAMESE GOVERNMENT APPROVES THE FINAL USA-CHINA
    COMMUNIQUE AND FEELS IT UPHOLDS THE USA COMMITMENTS TO SOUTH
    VIETNAM.

31. US ASSISTANT SECRETARY FOR EAST ASIAN AFFAIRS MARSHALL GREEN
    REAFFIRMS USA DEFENSE COMMITMENT TO TAIWAN AND SAYS THE USA
    WILL CONTINUE DIPLOMATIC RELATIONS WITH THE TAIWANESE
    GOVERNMENT.

32. NORTH VIETNAM TO ESTABLISH FULL DIPLOMATIC RELATIONS WITH
    SWEDEN.

33. USA PROTESTS INDIA'S ABANDONMENT OF NEUTRALITY BY
    ESTABLISHING FULL DIPLOMATIC RELATIONS WITH NORTH VIETNAM.

34. TAIWAN AND UN SIGNED AGREEMENT TO BUILD TYPHOON AND FLOOD
    WARNING SYSTEM.

35. CHINA EXPELLED ITALIAN MISSION BECAUSE TRIP BLESSED BY POPE.

36. WEST GERMANY CAUGHT 5 SOVIET CITIZENS SPYING ON WEST GERMANY.

38. SYRIA AND ISRAEL EXCHANGE FIRE.

39. NORTH VIETNAM ASKED THE USSR AND CHINA TO CONTINUE AID TO HIS
    COUNTRY.

40. THAI MILITARY SOURCES ACCUSED CAMBODIA OF FIRING ON THAI
    TERRITORY.

41. WEST GERMANY REJECTS USSR CRITICISM OF NATO MANEUVERS.


42. CZECHOSLOVAKIA REFUSES TO LET USA STUDENTS ENTER
    CZECHOSLOVAKIA.

43. USSR CANCELS INDONESIAN FOREIGN MINISTER VISIT TO MOSCOW.

44. USA PRESIDENT SIGNED EXECUTIVE ORDER TO CUT OFF TRADE WITH
    RHODESIA.

45. SOUTH KOREA SMASHES 7 NORTH KOREAN ESPIONAGE RINGS INVOLVING
    9 SPIES AND 14 COLLABORATORS IN SEOUL, TAEGU, AND THE EASTERN
    PORT OF POHANG.

46. HONDURAS SAID IT HAD EXPELLED SOME SALVADORIANS FOR ILLEGAL
    IMMIGRATION.

47. CHINA DEMONSTRATES IN PEKING AT USSR EMBASSY.

48. USSR MILITARY UNITS PARTICIPATE IN MONGOLIAN PARADE.

49. NIGERIA TAKES BIAFRAN PROVISIONAL CAPITAL OF OWERRI.

50. LEBANESE TOWNSPEOPLE SET FIRE TO ARAB COMMANDO OFFICE.






